Intro to NGS processing

James A. Fellows Yates

2021-08-17

Who am I?

  • Education
    • B.Sc. Bioarchaeology (University of York, UK)
    • M.Sc. Naturwissenschaftliches Archäologie (University of Tübingen, DE)
    • Ph.D. Archaeogenetics (MPI-SHH / MPI-EVA, DE)
  • Experience
    • Number of genetics classes taken: 0
    • Number of bioinformatics classes taken: 0

@jfy133

Today we will

  1. Introduce what DNA sequencing is
  2. Explain how Illumina NGS data is generated
  3. Show how to evaluate NGS data [Practical]

Introduction to DNA

What is DNA?

Deoxyribonucleic acid (/diːˈɒksɪˌraɪboʊnjuːˌkliːɪk, -ˌkleɪ-/; DNA) is a molecule composed of two polynucleotide chains that coil around each other to form a double helix carrying genetic instructions for the development, functioning, growth and reproduction of all known organisms and many viruses. - Wikipedia

What is DNA?

DNA structure

The rules

  • Four nucleotides
    • Pyrimidines: Cytosine, Thymine
    • Purines: Guanine, Adenine
  • Base pairing: one pyrimidine with one purine
    • C with G (think: CGI)
    • A with T (think: AT-AT walker)
  • Complementary
    • C on one strand, G on the other (or v.v.)
    • A on one strand, T on the other (or v.v.)
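
The pairing rules above can be sketched in a few lines of Python. This is only an illustration and not part of the original slides: the names `PAIRS`, `complement` and `is_complementary` are made up for this example, and strand orientation is ignored to keep things simple.

```python
# Hypothetical helper names, for illustration only.
PYRIMIDINES = {"C", "T"}
PURINES = {"G", "A"}

# Base-pairing rules: C with G, A with T
PAIRS = {"C": "G", "G": "C", "A": "T", "T": "A"}


def complement(strand):
    """Return the base that pairs with each position of the strand."""
    return "".join(PAIRS[base] for base in strand)


def is_complementary(strand_a, strand_b):
    """Check that every position pairs correctly (orientation is ignored)."""
    return len(strand_a) == len(strand_b) and complement(strand_a) == strand_b


print(complement("ACTG"))  # TGAC
```

Note that every pair in `PAIRS` joins one pyrimidine with one purine, as the rules require.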

The rules

  • Make a copy of a DNA strand with a polymerase
    • Unwind the DNA
    • Separate the strands
    • Make new strand: find a C, get new G (etc)
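
The copying steps above could be sketched as follows. This is a heavily simplified toy model under stated assumptions: `copy_strand` and `replicate` are hypothetical names, a double-stranded molecule is represented as a tuple of two strings, and strand direction is ignored.

```python
# Base-pairing rules: C with G, A with T
PAIRS = {"C": "G", "G": "C", "A": "T", "T": "A"}


def copy_strand(template):
    """Act like a simplified polymerase: read each base, add its partner."""
    new_strand = []
    for base in template:
        new_strand.append(PAIRS[base])  # find a C, get a new G (etc.)
    return "".join(new_strand)


def replicate(duplex):
    """Separate the two strands and use each one as a template."""
    strand_1, strand_2 = duplex
    daughter_1 = (strand_1, copy_strand(strand_1))
    daughter_2 = (copy_strand(strand_2), strand_2)
    return daughter_1, daughter_2


parent = ("ACTG", "TGAC")
daughter_1, daughter_2 = replicate(parent)
print(daughter_1, daughter_2)
```

Each daughter molecule keeps one original strand and gains one newly made strand, and both end up identical to the parent.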

DNA replication split

How do we get DNA?

Figure 17 01 02

Introduction to DNA Sequencing

What is Sequencing?

Converting the chemical nucleotides of a DNA molecule

to

ACTG on your computer screen

Historically

Sanger-sequencing

  • Sanger sequencing
    • Separate strands, add primer (starting point)
    • Add random mix of nucleotides, but some with special ‘terminators’
    • Pass through size-filtering, read order of terminators
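
The idea behind these steps can be simulated with a small Python sketch. It is a toy model, not a description of any real instrument or software: `sanger_read` and its parameters `copies`, `terminator_chance` and `seed` are assumptions made for this example, and the primer is assumed to sit directly before the first base of the template.

```python
import random

# Base-pairing rules: C with G, A with T
PAIRS = {"C": "G", "G": "C", "A": "T", "T": "A"}


def sanger_read(template, copies=5000, terminator_chance=0.2, seed=0):
    """Toy chain-termination sequencing of a single template strand."""
    rng = random.Random(seed)
    fragments = {}  # fragment length -> terminator base seen at that length
    for _ in range(copies):
        # Extend a new strand from the primer, one base at a time
        for position, base in enumerate(template, start=1):
            new_base = PAIRS[base]
            if rng.random() < terminator_chance:
                # A terminator was incorporated: extension stops here
                fragments[position] = new_base
                break
    # Size-filtering: order the fragments from shortest to longest
    # and read off the terminator at the end of each one
    return "".join(fragments[length] for length in sorted(fragments))


print(sanger_read("TGACCTGA"))
```

The result is the sequence of the newly made strand, which is the complement of the template. Many copies are needed so that, by chance, at least one fragment stops at every position.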

Pros and cons of Sanger Sequencing

  • Pros
    • More accurate (fewer errors)
    • Longer reads
  • Cons
    • Resource heavy: needs a lot of input DNA
    • Slow: one fragment at a time

What is NGS?

NGS: Next Generation Sequencing

  • MASSIVELY multiplexed!
  • Sequence millions and millions and millions and millions of DNA reads at once!

Not really ‘next’ anymore; consider it more ‘second’ generation (see: Nanopore)

Market leader:

How does it work?

Basically same concept, but